Collaborating Authors

 univariate time series


Multivariate Time Series Anomaly Detection: A Framework of Hidden Markov Models

Li, Jinbo, Pedrycz, Witold, Jamal, Iqbal

arXiv.org Artificial Intelligence

In this study, we develop an approach to multivariate time series anomaly detection focused on the transformation of multivariate time series to univariate time series. Several transformation techniques involving Fuzzy C-Means (FCM) clustering and the fuzzy integral are studied. Subsequently, a Hidden Markov Model (HMM), a commonly used statistical method, is used to detect anomalies in multivariate time series. We construct HMM-based anomaly detectors and, in this context, compare several transformation methods. A suite of experimental studies along with some comparative analysis is reported.


DEEDEE: Fast and Scalable Out-of-Distribution Dynamics Detection

Aljaafari, Tala, Kanade, Varun, Torr, Philip, de Witt, Christian Schroeder

arXiv.org Artificial Intelligence

Deploying reinforcement learning (RL) in safety-critical settings is constrained by brittleness under distribution shift. We study out-of-distribution (OOD) detection for RL time series and introduce DEEDEE, a two-statistic detector that revisits representation-heavy pipelines with a minimal alternative. DEEDEE uses only an episodewise mean and an RBF kernel similarity to a training summary, capturing complementary global and local deviations. Despite its simplicity, DEEDEE matches or surpasses contemporary detectors across standard RL OOD suites, delivering a 600-fold reduction in compute (FLOPs / wall-time) and an average 5% absolute accuracy gain over strong baselines. Conceptually, our results indicate that diverse anomaly types often imprint on RL trajectories through a small set of low-order statistics, suggesting a compact foundation for OOD detection in complex environments.
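
The two-statistic idea can be illustrated with a minimal sketch. The abstract does not specify how the training summary is built or how the two statistics are combined, so everything below is an assumption made for illustration: episodes are equal-length lists of scalars, the summary is a per-timestep average of the training episodes, and the score simply adds a z-score of the episode mean to one minus the RBF similarity. The names `fit_summary` and `ood_score` are hypothetical and do not come from the paper.

```python
import math
from statistics import mean, pstdev


def rbf_similarity(x, y, gamma=1.0):
    """RBF kernel exp(-gamma * ||x - y||^2) between two equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def fit_summary(train_episodes):
    """Assumed training summary: statistics of episode means plus a mean trajectory."""
    means = [mean(e) for e in train_episodes]
    length = len(train_episodes[0])
    template = [mean([e[t] for e in train_episodes]) for t in range(length)]
    # Fall back to 1.0 if all training means are identical (avoids division by zero).
    return {"mu": mean(means), "sigma": pstdev(means) or 1.0, "template": template}


def ood_score(episode, summary, gamma=1.0):
    """Higher means more out-of-distribution (assumed additive combination)."""
    # Global deviation: how far the episode mean sits from the training means.
    z = abs(mean(episode) - summary["mu"]) / summary["sigma"]
    # Local deviation: low kernel similarity to the training template.
    sim = rbf_similarity(episode, summary["template"], gamma)
    return z + (1.0 - sim)


train = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [0.0, 0.0, 0.1]]
summary = fit_summary(train)
in_dist = ood_score([0.0, 0.1, 0.1], summary)
shifted = ood_score([2.0, 2.0, 2.0], summary)
```

In this toy setup the shifted episode receives a much larger score than the in-distribution one; a real detector would calibrate a threshold on held-out training episodes.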


Label-Guided Imputation via Forest-Based Proximities for Improved Time Series Classification

Rhodes, Jake S., Rustad, Adam G., Maia, Sofia Pelagalli, Thacker, Evan, Choi, Hyunmi, Gutierrez, Jose, Rundek, Tatjana, Shaw, Ben

arXiv.org Machine Learning

Missing data is a common problem in time series data. Most methods for imputation ignore label information pertaining to the time series even if that information exists. In this paper, we provide a framework for missing data imputation in the context of time series classification, where each time series is associated with a categorical label. We define a means of imputing missing values conditional upon labels, the method being guided by powerful, existing supervised models designed for high accuracy in this task. From each model, we extract a tree-based proximity measure from which imputation can be applied. We show that imputation using this method generally provides richer information leading to higher classification accuracies, despite the imputed values differing from the true values.


Finding the DeepDream for Time Series: Activation Maximization for Univariate Time Series

Schlegel, Udo, Keim, Daniel A., Sutter, Tobias

arXiv.org Artificial Intelligence

Understanding how models process and interpret time series data remains a significant challenge in deep learning, and it limits their applicability in safety-critical areas such as healthcare. In this paper, we introduce Sequence Dreaming, a technique that adapts Activation Maximization to analyze sequential information, aiming to enhance the interpretability of neural networks operating on univariate time series. By leveraging this method, we visualize the temporal dynamics and patterns most influential in model decision-making processes. To counteract the generation of unrealistic or excessively noisy sequences, we enhance Sequence Dreaming with a range of regularization techniques, including exponential smoothing. This approach ensures the production of sequences that more accurately reflect the critical features identified by the neural network. Our approach is tested on a time series classification dataset encompassing applications in predictive maintenance. The results show that our proposed Sequence Dreaming approach demonstrates targeted activation maximization for different use cases so that either centered class or border activation maximization can be generated. The results underscore the versatility of Sequence Dreaming in uncovering salient temporal features learned by neural networks, thereby advancing model transparency and trustworthiness in decision-critical domains.


Causal Discovery-Driven Change Point Detection in Time Series

Gao, Shanyun, Addanki, Raghavendra, Yu, Tong, Rossi, Ryan A., Kocaoglu, Murat

arXiv.org Machine Learning

Change point detection in time series seeks to identify times when the probability distribution of time series changes. It is widely applied in many areas, such as human-activity sensing and medical science. In the context of multivariate time series, this typically involves examining the joint distribution of high-dimensional data: If any one variable changes, the whole time series is assumed to have changed. However, in practical applications, we may be interested only in certain components of the time series, exploring abrupt changes in their distributions in the presence of other time series. Here, assuming an underlying structural causal model that governs the time-series data generation, we address this problem by proposing a two-stage non-parametric algorithm that first learns parts of the causal structure through constraint-based discovery methods. The algorithm then uses conditional relative Pearson divergence estimation to identify the change points. The conditional relative Pearson divergence quantifies the distribution disparity between consecutive segments in the time series, while the causal discovery method enables a focus on the causal mechanism, facilitating access to independent and identically distributed (IID) samples. Theoretically, the typical assumption of samples being IID in conventional change point detection methods can be relaxed based on the Causal Markov Condition. Through experiments on both synthetic and real-world datasets, we validate the correctness and utility of our approach.


Meta-learning and Data Augmentation for Stress Testing Forecasting Models

Inácio, Ricardo, Cerqueira, Vitor, Barandas, Marília, Soares, Carlos

arXiv.org Machine Learning

The effectiveness of univariate forecasting models is often hampered by conditions that cause them stress. A model is considered to be under stress if it shows a negative behaviour, such as higher-than-usual errors or increased uncertainty. Understanding the factors that cause stress to forecasting models is important to improve their reliability, transparency, and utility. This paper addresses this problem by contributing a novel framework called MAST (Meta-learning and data Augmentation for Stress Testing). The proposed approach aims to model and characterize stress in univariate time series forecasting models, focusing on conditions where they exhibit large errors. In particular, MAST is a meta-learning approach that predicts the probability that a given model will perform poorly on a given time series based on a set of statistical time series features. MAST also encompasses a novel data augmentation technique based on oversampling to improve the metadata concerning stress. We conducted experiments using three benchmark datasets that contain a total of 49,794 time series to validate the performance of MAST. The results suggest that the proposed approach is able to identify conditions that lead to large errors. The method and experiments are publicly available in a repository.


Hinge-FM2I: An Approach using Image Inpainting for Interpolating Missing Data in Univariate Time Series

Saad, Noufel, Nadir, Maaroufi, Mehdi, Najib, Mohamed, Bakhouya

arXiv.org Machine Learning

Accurate time series forecasts are crucial for various applications, such as traffic management, electricity consumption, and healthcare. However, limitations in models and data quality can significantly impact forecast accuracy. One common issue with data quality is the absence of data points, referred to as missing data. It is often caused by sensor malfunctions, equipment failures, or human errors. This paper proposes Hinge-FM2I, a novel method for handling missing data values in univariate time series data. Hinge-FM2I builds upon the strengths of the Forecasting Method by Image Inpainting (FM2I). FM2I has proven effective, but selecting the most accurate forecasts remains a challenge. To overcome this issue, we propose a selection algorithm. Inspired by door hinges, Hinge-FM2I drops a data point either before or after the gap (left/right-hinge), then uses FM2I for imputation, and then selects the imputed gap based on the lowest error of the dropped data point. Hinge-FM2I was evaluated on a comprehensive sample composed of 1356 time series, extracted from the M3 competition benchmark dataset, with missing value rates ranging from 3.57% to 28.57%. Experimental results demonstrate that Hinge-FM2I significantly outperforms established methods such as linear/spline interpolation, K-Nearest Neighbors (K-NN), and ARIMA. Notably, Hinge-FM2I achieves an average Symmetric Mean Absolute Percentage Error (sMAPE) score of 5.6% for small gaps, and up to 10% for larger ones. These findings highlight the effectiveness of Hinge-FM2I as a promising new method for addressing missing values in univariate time series data.
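
The hinge selection step described above can be sketched as follows. This is not FM2I: linear interpolation is swapped in as a stand-in imputer so that the example stays self-contained, and the names `linear_fill` and `hinge_impute` are hypothetical. The sketch assumes a single gap `series[start:end]` with at least two observed points on each side.

```python
def linear_fill(series, start, end):
    """Stand-in imputer (not FM2I): linearly interpolate series[start:end]
    between the observed neighbours series[start - 1] and series[end]."""
    left, right = series[start - 1], series[end]
    span = end - start + 1
    filled = list(series)
    for k, i in enumerate(range(start, end), start=1):
        filled[i] = left + (right - left) * k / span
    return filled


def hinge_impute(series, start, end, impute=linear_fill):
    """Fill the gap series[start:end] using the hinge with the lower error."""
    # Left hinge: also hide the point just before the gap, impute, and
    # measure how well the hidden point is recovered.
    left_fill = impute(series, start - 1, end)
    left_err = abs(left_fill[start - 1] - series[start - 1])
    # Right hinge: same idea with the point just after the gap.
    right_fill = impute(series, start, end + 1)
    right_err = abs(right_fill[end] - series[end])
    chosen = left_fill if left_err <= right_err else right_fill
    result = list(series)
    result[start:end] = chosen[start:end]  # keep observed points untouched
    return result


series = [1.0, 2.0, 3.0, None, None, 6.0, 10.0]
filled = hinge_impute(series, start=3, end=5)
```

Here the left hinge recovers the dropped value 3.0 exactly, while the right hinge is pulled upward by the jump to 10.0, so the left-hinge imputation is selected.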


Lag Selection for Univariate Time Series Forecasting using Deep Learning: An Empirical Study

Leites, José, Cerqueira, Vitor, Soares, Carlos

arXiv.org Machine Learning

Most forecasting methods use recent past observations (lags) to model the future values of univariate time series. Selecting an adequate number of lags is important for training accurate forecasting models. Several approaches and heuristics have been devised to solve this task. However, there is no consensus about what the best approach is. Besides, lag selection procedures have been developed based on local models and classical forecasting techniques such as ARIMA. We bridge this gap in the literature by carrying out an extensive empirical analysis of different lag selection methods. We focus on deep learning methods trained in a global approach, i.e., on datasets comprising multiple univariate time series. The experiments were carried out using three benchmark databases that contain a total of 2411 univariate time series. The results indicate that the lag size is a relevant parameter for accurate forecasts. In particular, excessively small or excessively large lag sizes have a considerable negative impact on forecasting performance. Cross-validation approaches show the best performance for lag selection, but this performance is comparable with simple heuristics.
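
To make the role of the lag size concrete, the sketch below builds a supervised dataset from lagged windows and pools several series, in the spirit of the global training approach described above. The function names `lag_embed` and `global_dataset` are illustrative assumptions, not taken from the paper.

```python
def lag_embed(series, n_lags):
    """Turn one series into (previous n_lags values, next value) pairs."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])
        y.append(series[t])
    return X, y


def global_dataset(collection, n_lags):
    """Pool the lagged pairs of every series into a single training set."""
    X_all, y_all = [], []
    for series in collection:
        X, y = lag_embed(series, n_lags)
        X_all.extend(X)
        y_all.extend(y)
    return X_all, y_all


X, y = lag_embed([1, 2, 3, 4, 5], n_lags=2)
# X == [[1, 2], [2, 3], [3, 4]], y == [3, 4, 5]
X_glob, y_glob = global_dataset([[1, 2, 3, 4, 5], [10, 20, 30]], n_lags=2)
```

Each extra lag widens the input window but removes one training pair per series, which hints at the trade-off: too few lags hide useful history, while too many shrink the training set and add inputs that may carry little signal.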


Time Series Data Augmentation as an Imbalanced Learning Problem

Cerqueira, Vitor, Moniz, Nuno, Inácio, Ricardo, Soares, Carlos

arXiv.org Machine Learning

Recent state-of-the-art forecasting methods are trained on collections of time series. These methods, often referred to as global models, can capture common patterns in different time series to improve their generalization performance. However, they require large amounts of data that might not be readily available. Besides this, global models sometimes fail to capture relevant patterns unique to a particular time series. In these cases, data augmentation can be useful to increase the sample size of time series datasets. The main contribution of this work is a novel method for generating univariate time series synthetic samples. Our approach stems from the insight that the observations concerning a particular time series of interest represent only a small fraction of all observations. In this context, we frame the problem of training a forecasting model as an imbalanced learning task. Oversampling strategies are popular approaches used to deal with the imbalance problem in machine learning. We use these techniques to create synthetic time series observations and improve the accuracy of forecasting models. We carried out experiments using 7 different databases that contain a total of 5502 univariate time series. We found that the proposed solution outperforms both a global and a local model, thus providing a better trade-off between these two approaches.
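
The oversampling idea can be illustrated with a simplified, SMOTE-style interpolation between lagged windows of the series of interest. This is a sketch under stated assumptions rather than the authors' method: partners are drawn at random instead of from nearest neighbours, and the name `interpolate_oversample` is hypothetical.

```python
import random


def interpolate_oversample(samples, n_new, seed=0):
    """Create n_new synthetic vectors, each a random convex combination
    of two distinct vectors drawn from samples."""
    rng = random.Random(seed)  # local generator keeps results reproducible
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(samples, 2)
        w = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([x + w * (y - x) for x, y in zip(a, b)])
    return synthetic


# Lagged windows from the (minority) series of interest.
windows = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
new_windows = interpolate_oversample(windows, n_new=5)
```

Because every synthetic window lies on the segment between two real windows, it stays inside the range of the observed data, which is the property that makes interpolation-based oversampling a conservative way to enlarge the sample.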


MST-GAT: A Multimodal Spatial-Temporal Graph Attention Network for Time Series Anomaly Detection

Ding, Chaoyue, Sun, Shiliang, Zhao, Jing

arXiv.org Artificial Intelligence

Multimodal time series (MTS) anomaly detection is crucial for maintaining the safety and stability of working devices (e.g., water treatment systems and spacecraft), whose data are characterized by multivariate time series with diverse modalities. Although recent deep learning methods show great potential in anomaly detection, they do not explicitly capture spatial-temporal relationships between univariate time series of different modalities, resulting in more false negatives and false positives. In this paper, we propose a multimodal spatial-temporal graph attention network (MST-GAT) to tackle this problem. MST-GAT first employs a multimodal graph attention network (M-GAT) and a temporal convolution network to capture the spatial-temporal correlation in multimodal time series. Specifically, M-GAT uses a multi-head attention module and two relational attention modules (i.e., intra- and inter-modal attention) to model modal correlations explicitly. Furthermore, MST-GAT optimizes the reconstruction and prediction modules simultaneously. Experimental results on four multimodal benchmarks demonstrate that MST-GAT outperforms the state-of-the-art baselines. Further analysis indicates that MST-GAT strengthens the interpretability of detected anomalies by locating the most anomalous univariate time series.